Presentation Of The Eurolang Project
Abstract
International trade in general, and the Single European Market in particular, will bring about a considerable increase in the already huge documentation market. NLP products will contribute to improving the competitiveness of industry in this strategic field. The industrial objective of EUROLANG is thus to provide efficient NLP tools and to give the European business community a better opportunity to maintain a command of multilingual technical and commercial communication. The technical objective of the EUROLANG project is to build an MT/NLP toolbox offering a wide range of 'open' and powerful tools that reflect the state of the art in computing and linguistic techniques. These tools will then be validated via the production of a multilingual MT system based on second-generation principles. With respect to the computing aspects, the main technical choices are portability, maintainability, openness, possibility of evolution and ergonomy. This implies the use of standard techniques and tools (UNIX, X11, MOTIF, C, SQL, SGML). The linguistic developments, based on a syntactico-semantic analysis, will follow an industrial methodology, implying formal linguistic specifications. The use of specialised languages will ensure a better separation between software and lingware, and thus better modularity. The three-phase translation process guarantees the multilingual aspect of the MT system.
INTRODUCTION
In the light of the communication society, which is leading industrial companies to become involved in NLP technology, the EUROLANG partners have decided to pool their technical, human and commercial resources in order to define and develop a new 'high quality' / 'low cost' MT system. This system will be based on state-of-the-art computing and linguistic techniques, and is aimed at the market for quality translation with post-editing.
To ensure the reusability of the linguistic resources and the possibility of evolution of the system, a powerful toolbox, providing different NLP tools and useful 'plug and play' functionalities, will be developed. This toolbox will be a linguistic platform providing a user-friendly environment in which to design and develop different kinds of NLP applications. In this paper, we present more specifically the aims and motivations of the project and the main technical choices, with regard to both computing and linguistic issues.
GENERAL PRESENTATION
Aims and motivations of EUROLANG
The technical objective of the project is to build an MT/NLP toolbox offering a wide range of 'open' and powerful tools that reflect the state of the art in computing and linguistic techniques. These tools will then be validated via the production of a multilingual MT system based on second-generation principles. This system will process five European languages (English, French, German, Italian, Spanish). Only ten language pairs (chosen according to the market and partner needs) will be dealt with: the eight pairs involving English, and the two French/German pairs. The dictionaries will contain 50 000 terms in each language, and their equivalents in the other languages.
The general objective is to yield products at the end of the project, targeted at the 'low cost' / 'high quality' market. This goal is made possible by the 1991 state of the art in MT technology and will considerably increase the share of the market currently occupied by commercial MT systems. The market is estimated at 12 billion dollars (Gartner Group figures), and is rapidly increasing. The following trends suggest a boom in this market: a continuous increase in international trade; an enormous need within the Single European Market; a shortage and ever-increasing cost of qualified translators; and the strategic need for a company involved in a competitive export market to have high-quality translations of technical and commercial documents rapidly available.
ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 | 1289 | PROC. OF COLING-92, NANTES, AUG. 23-28, 1992
The Japanese were the first to appreciate the strategic importance of substantial investment in this field. The major corporations invest considerable sums and participate in projects such as EDR, ATR... The state of the art in computing and linguistic technologies, combined with the skills to deal with different European languages, seems to offer a good opportunity for Europe to become a leader in this sector.
The linguistic quality of a translation is of course one of the major criteria involved in evaluating an MT system; however, there are others (cf. [Roudaud 91]). Some linguistic phenomena are very complex, and thus very costly, to deal with, whereas another technical solution, less resource-consuming, could lead to an equivalent (or even better) result as far as efficiency is concerned. The industrial issue implies that a compromise must be reached between cost, efficiency and linguistic quality: the right solution to the right problem. For example, anaphora are a very difficult linguistic problem to solve, whereas in technical documentation the French pronoun il can more often be translated into English by it. In such a case, a better solution would be to propose it as the default translation, and he and she as alternatives, so that the revisor can choose one or the other.
This last point shows that we must adopt a global approach to the design of an MT system. Not only is the MT kernel of major importance, but a user-friendly environment, providing many useful tools for the end-user (translator, writer,...), must also be designed. This is the reason why special attention is paid in this project to the design of the pre- and post-editing tools.
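The pronoun example above (propose it as the default for il, with he and she as alternatives for the revisor) can be pictured with a small data structure. The following Python sketch is purely illustrative and is not taken from the EUROLANG design: the names TranslationProposal, PRONOUN_DEFAULTS and propose are hypothetical, and only the il / it / he / she entry comes from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TranslationProposal:
    """A default rendering plus alternatives offered to the revisor."""
    source: str
    default: str
    alternatives: List[str] = field(default_factory=list)

    def choose(self, choice: Optional[str] = None) -> str:
        """Return the default unless the revisor picks a listed alternative."""
        if choice is None:
            return self.default
        if choice == self.default or choice in self.alternatives:
            return choice
        raise ValueError(f"{choice!r} is not offered for {self.source!r}")


# Hypothetical lookup table; only the entry for "il" is taken from the text.
PRONOUN_DEFAULTS: Dict[str, TranslationProposal] = {
    "il": TranslationProposal(source="il", default="it", alternatives=["he", "she"]),
}


def propose(pronoun: str) -> TranslationProposal:
    """Look up the proposal for a source pronoun (case-insensitive)."""
    return PRONOUN_DEFAULTS[pronoun.lower()]


if __name__ == "__main__":
    proposal = propose("il")
    print(proposal.choose())      # MT output keeps the cheap default
    print(proposal.choose("he"))  # revisor overrides during post-editing
```

The point of the sketch is the trade-off described in the text: the system does not attempt full anaphora resolution, it emits the statistically likely default and leaves the rare correction to the post-editing environment.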
For users, an MT system is only a part of the documentary system, and integration and connection with other tools must be foreseen.
It is well known that second-generation MT systems are very time-consuming. Predictions for the coming years show that workstations will very rapidly reach 50 or 100 MIPS, and that the cost of MIPS will continue to decrease. All these factors contribute to the reduction of exploitation costs of MT systems and to the improvement of the delivery time/volume ratio.
Organisation and main technical choices
EUROLANG is a three-year EUREKA project which began in December '91. The first year will mainly consist of specifying both computer and linguistic developments. The two following years will be devoted to development, integration, tests and evaluation. The global cost of the project is 684 MFF (about 140M US$). It involves five European countries: France, Germany, Italy, Spain, United Kingdom. The partners of the project are:
in France: SITE group (prime contractor), CAP GEMINI INNOVATION, CNET